Closes #176
This PR prevents the message history from growing without bound, which previously caused high operational costs and frequent context-length errors in LLM calls.
We now enforce a configurable token limit, improving stability and cost control.
Key Changes
Token Limit Enforcement:
Added a new setting, MAX_TOKEN_LIMIT (default: 100,000 tokens), in src/ansari/config.py.
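A minimal sketch of what such a setting could look like; the plain-constant style and environment-variable override are illustrative assumptions, since the actual src/ansari/config.py may use a settings framework:

```python
import os

# Illustrative sketch of the new setting; the real config module may wrap
# this in a settings class. The env-var override is an assumption, shown
# only to illustrate how the limit stays configurable per deployment.
MAX_TOKEN_LIMIT: int = int(os.environ.get("MAX_TOKEN_LIMIT", 100_000))
```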
Integrated the tiktoken library to accurately count tokens in the message history (src/ansari/agents/ansari.py).
The process_message_history function now checks the token count. If the limit is exceeded, the agent gracefully refuses to process the request and prompts the user to start a new conversation.
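The enforcement logic can be sketched as below; the helper name, return shape, and refusal wording are illustrative assumptions, not the PR's actual implementation:

```python
MAX_TOKEN_LIMIT = 100_000  # mirrors the default from config.py

def check_token_limit(token_count: int, limit: int = MAX_TOKEN_LIMIT):
    """Return (allowed, refusal_message) for a given history token count.

    Hypothetical sketch of the check performed in process_message_history:
    when the history exceeds the limit, the agent refuses gracefully and
    asks the user to start a new conversation instead of calling the LLM.
    """
    if token_count > limit:
        return False, (
            "This conversation has become too long to process. "
            "Please start a new conversation."
        )
    return True, None
```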
Bug Fix:
Resolved a circular import issue in src/ansari/app/main_api.py by reordering imports. This ensures the server starts correctly when whatsapp_router is included.
Testing
Added new unit tests in tests/unit/test_token_limit.py to verify that the token limit is enforced correctly (blocking long histories while allowing short ones).
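The tests might look something like this sketch; `enforce_token_limit` is an illustrative stand-in for the real check, not the actual helper under test:

```python
# Hypothetical shape of tests/unit/test_token_limit.py. The helper below
# stands in for the real enforcement logic so the tests are self-contained.
def enforce_token_limit(token_count: int, limit: int = 100_000) -> bool:
    """Return True when the history is short enough to process."""
    return token_count <= limit

def test_short_history_is_allowed():
    assert enforce_token_limit(500)

def test_history_at_limit_is_allowed():
    assert enforce_token_limit(100_000)

def test_long_history_is_blocked():
    assert not enforce_token_limit(150_000)
```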